Now that we have identified and practiced many foundational elements - such as generating code to explore data, cleaning data for analysis, and some elements of theory construction - we can begin focusing on one of the most important technical components of model building and analysis: interpretation.
Interpretation relies very heavily on both your research question and the subsequent empirical study.
While your research question may be based on a host of factors, your empirical study relies on a combination of:
The research questions below highlight the intersection of social justice issues in multiple variable quantitative analysis. Keep in mind that these questions can be further refined and tailored to specific contexts or issues of interest within the realm of social justice.
How do income inequality and geographical location affect access to quality education?
What disparities exist in the criminal justice system by race and gender?
How do gender discrimination and age impact career advancement in the workplace?
What are the effects of housing policies and income on residential segregation and access to affordable housing?
How do healthcare accessibility and affordability vary across different socioeconomic groups?
Sample analysis
Let us continue with a sample analysis.
We will assume that state data were collected for a sample of 100 randomly selected cities requesting funding after the approval of a new bill on affordable housing. The data set includes a city identifier and three key variables.
Research question
What is the relationship between state funding for affordable housing initiatives and the availability of new affordable housing units?
Details about each variable are provided below:
city is a marker (which matches the data index) used to indicate a randomly selected city.
funding is the total amount of funding provided to families (in thousands of dollars) in a given 3-week period.
housing_availability is the average number of city housing units allocated over the same funding period.
advocacy is the average number of calls to the state representatives’ hotline four months prior.
The advocacy variable was generated as a result of a similar study conducted in a neighboring state, which noticed that there was a potential lag-relationship between advocacy and funding allocations approved at the state-level.
city funding housing_availability advocacy
Min. : 1.00 Min. :216.2 Min. :34.60 Min. :16.19
1st Qu.: 25.75 1st Qu.:280.4 1st Qu.:48.54 1st Qu.:22.52
Median : 50.50 Median :343.4 Median :54.97 Median :27.04
Mean : 50.50 Mean :360.4 Mean :54.54 Mean :26.63
3rd Qu.: 75.25 3rd Qu.:437.7 3rd Qu.:59.55 3rd Qu.:30.30
Max. :100.00 Max. :547.4 Max. :77.86 Max. :37.34
Exploration
We can use some base-R commands to get a quick summary of each variable.
# get plots of variables
hist(funding)
hist(housing_availability)
# get summary statistics for variables
summary(funding)
Min. 1st Qu. Median Mean 3rd Qu. Max.
216.2 280.4 343.4 360.4 437.7 547.4
summary(housing_availability)
Min. 1st Qu. Median Mean 3rd Qu. Max.
34.60 48.54 54.97 54.54 59.55 77.86
We can also produce quick plots to examine the relationships between the variables. Here, we include code to get the correlation coefficient.
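A minimal version of that code, assuming the variables are available in the workspace as in the earlier summaries, might be:

```r
# scatterplot of funding against housing availability
plot(funding, housing_availability,
     xlab = "City Funding", ylab = "Housing Availability")

# correlation coefficient between the two variables
cor(funding, housing_availability)
```

The correlation coefficient gives a quick numeric companion to the scatterplot before any model is fit.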
First, researchers decided to run a linear regression model on housing_availability and funding.
# perform linear regression analysis
model1 <- lm(housing_availability ~ funding)
# summary of the regression model
summary(model1)
Call:
lm(formula = housing_availability ~ funding)
Residuals:
Min 1Q Median 3Q Max
-18.1793 -5.9060 -0.6551 5.0543 22.4049
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 45.49080 3.41826 13.308 < 2e-16 ***
funding 0.02511 0.00918 2.736 0.00739 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 8.579 on 98 degrees of freedom
Multiple R-squared: 0.07095, Adjusted R-squared: 0.06147
F-statistic: 7.484 on 1 and 98 DF, p-value: 0.007391
Plot the data and regression line
ggplot(data, aes(x = funding, y = housing_availability)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "City Funding", y = "Housing Availability",
       title = "Relationship between City Funding and Housing Availability")
One researcher, however, suggested that a more robust regression analysis should be used in place of ordinary least squares (OLS) techniques. Robust regression analysis, as you may recall, helps us reduce the effect of outliers.
Note: we need to load the MASS package with library(MASS) to run the following code.
The decisions were made based on the following notes:
Cook’s distance, computed with cooks.distance(), provides a measure of the influence of a data point when performing regression.
stdres() standardizes the residuals from our model.
cbind() attaches the two measures to our data frame.
We can use a cutoff point of \(4/n\), where \(n\) is the sample size, as recommended by others, to select the values to display.
We then take the absolute value of the residuals (remember that the sign does not matter for distance), and we print the observations with the largest residuals (here we focus on the top 10 values).
We fit the robust regression itself using the rlm() function in the MASS package.
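Putting those notes together, a sketch of the diagnostic steps (assuming model1 and the data frame data from above) could look like:

```r
library(MASS)

# measure the influence of each observation on model1
cooksd <- cooks.distance(model1)

# standardized residuals from the model (stdres() is in the MASS package)
sresid <- stdres(model1)

# attach both measures to the data frame
data <- cbind(data, cooksd, sresid)

# flag observations above the 4/n Cook's distance cutoff
data[data$cooksd > 4 / nrow(data), ]

# print the 10 observations with the largest absolute residuals
head(data[order(abs(data$sresid), decreasing = TRUE), ], 10)
```

Observations flagged by both measures are the strongest candidates for a closer look before (or alongside) fitting the robust model.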
There are several weights that can be used for the iterated re-weighted least squares technique (IRLS).
rrmodel <- rlm(housing_availability ~ funding, data = data)
summary(rrmodel)
Call: rlm(formula = housing_availability ~ funding, data = data)
Residuals:
Min 1Q Median 3Q Max
-17.9198 -5.4416 -0.3424 5.2609 22.8048
Coefficients:
Value Std. Error t value
(Intercept) 45.7779 3.5798 12.7880
funding 0.0234 0.0096 2.4328
Residual standard error: 8.213 on 98 degrees of freedom
The default weight is the Huber weight.
Huber weights are a type of weight function used to downweight or mitigate the influence of outliers on the estimation procedure.
In traditional least squares regression, all data points are given equal weight, and the estimation procedure is sensitive to the presence of outliers. The use of weights in our robust regression model aims to provide more robust estimates by assigning different weights to the observations, giving less influence to outliers.
Huber weights assign larger weights to observations that are close to the regression line and smaller weights to observations that deviate significantly from the line. The weight assigned to each observation depends on its residuals (the difference between the observed values and the predicted values).
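One way to see this weighting in action is to inspect the final IRLS weights stored in the fitted rlm object (its w component); observations the procedure downweighted will have weights below 1. This is a quick diagnostic sketch, not part of the original analysis:

```r
# pair each observation's residual with its final Huber weight
weights_df <- data.frame(city   = data$city,
                         resid  = rrmodel$residuals,
                         weight = rrmodel$w)

# sort ascending: the smallest weights mark the most outlying observations
head(weights_df[order(weights_df$weight), ], 10)
```

Comparing this list with the Cook's distance results above helps confirm which cities are driving the difference between the OLS and robust fits.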
Causality
Despite our work on the initial model, the issue of causality needs to be discussed.
There are a few considerations that need to be taken into account:
Confounding variables: There may be other factors that influence the model apart from city funding. For example, economic conditions and social policies can also play significant roles. Failing to account for these confounding variables may lead to erroneous conclusions about the causal relationship.
Reverse causality: The relationships can be bidirectional. Higher housing availability rates may lead to increased city funding directed at addressing the issue. Thus, it’s possible that the relationship is driven by reverse causality, where higher levels of housing availability cause increased funding rather than the other way around.
Omitted variable bias: There may be unobserved or unmeasured factors that affect both city funding and housing availability. Failing to include these variables in the analysis can lead to omitted variable bias, potentially distorting the estimated relationships.
Ecological fallacy: Analyzing data aggregated at the state and city levels may not capture the relevant nuances within the relationship. Aggregating data can lead to an ecological fallacy, where conclusions made at the aggregate level may not hold true at other levels.
Multicollinearity
Multicollinearity refers to a high correlation or linear relationship between two or more predictor variables in a regression model. In the case of three variables, multicollinearity occurs when there is a strong linear relationship between any pair of the three variables, making it difficult to separate their individual effects on the response variable. This can cause instability in the regression model, inflated standard errors, and difficulties in interpreting the coefficients.
Assume we have updated our theoretical statement and research question and added the advocacy variable to our model.
# perform linear regression analysis
model2 <- lm(housing_availability ~ funding + advocacy)
# summary of the regression model
summary(model2)
Call:
lm(formula = housing_availability ~ funding + advocacy)
Residuals:
Min 1Q Median 3Q Max
-17.9890 -6.1250 -0.6158 4.9763 22.3024
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 44.80516 4.77827 9.377 2.97e-15 ***
funding 0.02408 0.01049 2.296 0.0238 *
advocacy 0.03969 0.19229 0.206 0.8369
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 8.621 on 97 degrees of freedom
Multiple R-squared: 0.07136, Adjusted R-squared: 0.05221
F-statistic: 3.727 on 2 and 97 DF, p-value: 0.02759
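Before moving on, we can check whether funding and advocacy are correlated enough to inflate the standard errors in model2 by computing variance inflation factors (VIFs). This sketch assumes the car package is installed; a common rule of thumb flags VIF values above 5 or 10:

```r
# correlation between the two predictors
cor(funding, advocacy)

# variance inflation factors for model2 (vif() is in the car package)
library(car)
vif(model2)
```

A moderate predictor correlation with low VIFs suggests multicollinearity is present but not severe enough to invalidate the coefficient estimates.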
Interaction effects
Next, we add an interaction term to our model.
# get a summary of the advocacy data
summary(advocacy)
Min. 1st Qu. Median Mean 3rd Qu. Max.
16.19 22.52 27.04 26.63 30.30 37.34
# examine the relationship between funding and advocacy
cor(advocacy, funding)
[1] 0.4757307
# perform linear regression analysis
model3 <- lm(housing_availability ~ funding + advocacy + funding * advocacy)
# summary of the regression model
summary(model3)
Call:
lm(formula = housing_availability ~ funding + advocacy + funding *
advocacy)
Residuals:
Min 1Q Median 3Q Max
-17.9963 -6.2218 -0.5457 4.8889 22.3465
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 49.0944885 17.5511591 2.797 0.00623 **
funding 0.0117777 0.0495659 0.238 0.81268
advocacy -0.1236422 0.6712607 -0.184 0.85425
funding:advocacy 0.0004576 0.0018009 0.254 0.79997
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 8.663 on 96 degrees of freedom
Multiple R-squared: 0.07198, Adjusted R-squared: 0.04298
F-statistic: 2.482 on 3 and 96 DF, p-value: 0.06555
Please note that we may need to run additional tests or more robust models to inform interpretation.
Statistical vs. practical significance
When analyzing the relationship between state funding and housing availability, it is important to consider both statistical significance and practical significance.
Statistical significance refers to the likelihood that the observed relationship or difference between variables is not due to chance. It is determined through statistical tests, such as hypothesis testing or p-values. In this context, statistical significance would indicate whether there is evidence to suggest that state funding has a statistically significant effect on housing availability. A statistically significant result suggests that the relationship between the variables is unlikely to have occurred by random chance.
Practical significance focuses on the magnitude or practical importance of the observed relationship. It asks whether the observed effect size is meaningful or substantial in real-world terms. In the case of state funding and housing availability, practical significance would involve evaluating whether the observed impact of state funding on housing availability is large enough to have a meaningful or substantial effect on the availability of housing units.
Note, however, that while statistical significance provides evidence of a relationship, it does not necessarily imply practical importance. A statistically significant relationship may exist but have a negligible or trivial effect in practice. Conversely, a relationship may have practical significance, even if it does not reach statistical significance due to limited sample size or other factors.
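For our housing example, one way to gauge practical significance is to translate the model1 slope into real-world units: funding is measured in thousands of dollars, so multiplying the coefficient by a policy-relevant funding change gives the predicted change in average housing availability. The $100,000 increment below is chosen purely for illustration:

```r
# estimated slope for funding from model1
b_funding <- coef(model1)["funding"]

# predicted change in average housing availability for an
# additional $100,000 in funding (funding is in thousands of dollars)
b_funding * 100
```

Whether that predicted change is "large enough" is a substantive judgment about housing policy, not a statistical one.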
Replication studies
Exploring varied statistical outputs and their significance in a social justice context requires care, both in terms of the underlying theories that relate to the variables themselves and their use across different contexts. An additional factor that we have discussed relates to the role of theoretical constructions and their applicability to issues of social injustice.
More often than not, caution should take the lead when developing new models. In these instances, some variation on what is known as a replication study can become a valuable tool. A replication study is a type of study that aims to reproduce or replicate the findings of a previous study. In the context of our course, the replication frameworks can be applied to examine the relationships between variables across contexts and different populations.
There are different types of replication studies.
Direct replication: In this replication study type, researchers attempt to reproduce the original study as closely as possible, meaning they follow the same research design, methodologies, and data analysis procedures.
Partial replication: In this replication study type, researchers attempt to replicate only a portion of the original study. Often, researchers doing a partial replication study focus on a specific aspect, variable, or component of the study.
Conceptual replication: In this replication study type, researchers conduct a replication analysis that focuses on the same research question(s) but through the use of different methods, measures, or population groups.
While replication studies are often used to help ensure the credibility and generalizability of statistical research findings, they can also serve as part of a broader process to examine the role of context in statistical models. Importantly, failure to replicate the findings of a study does not mean that the original study findings were incorrect or flawed. Together, these types of explorations can contribute to scientific knowledge and provide evidence to help us understand the role of theory and the practice of social justice.
Beyond regression
Researchers have access to a wide range of advanced statistical techniques and methodologies that provide deeper insights into complex relationships and patterns within data. These approaches go beyond the linear relationships examined in regression analysis and allow researchers to explore non-linear, interactive, and dynamic effects among variables. By utilizing these advanced techniques, researchers can uncover hidden patterns, make more accurate predictions, account for complex interactions, and gain a more comprehensive understanding of the phenomena under investigation.
Some of these methods also provide greater flexibility in handling missing data, dealing with outliers, and accommodating various types of data structures. Overall, these advanced statistical techniques expand the set of tools researchers can use to delve deeper into the complexities of their data and extract meaningful insights.
Part II: Content
Multiple Variable Analysis and Multivariate Analysis are two terms often used in statistics and research methodology to describe different approaches to analyzing data involving multiple variables. While they share similarities, there are distinct differences between these two concepts.
Multivariable vs. Multivariate
Multiple variable analysis investigates the influence of individual independent variables on a single dependent variable, while multivariate analysis explores the relationships and patterns among multiple variables simultaneously.
Multiple Variable Analysis is often used when studying the effects of specific factors, while multivariate analysis is employed to uncover broader patterns and structures within a dataset. Both approaches are valuable in data analysis, and the choice between them depends on the research objectives and the nature of the data being analyzed.
Definitions: Multiple variable analysis vs. Multivariate analysis
Multiple Variable Analysis: Multiple Variable Analysis refers to the process of examining the relationships between several independent variables and a single dependent variable. It aims to understand how each independent variable influences or predicts the dependent variable individually, while controlling for other variables. In this analysis, each independent variable is analyzed separately, often using techniques such as regression analysis or analysis of variance (ANOVA).
Multivariate Analysis: Multivariate Analysis involves the simultaneous analysis of multiple dependent and independent variables. It aims to explore the relationships and patterns among multiple variables, considering them as a whole. This analysis technique allows for the examination of complex interactions and associations between variables, providing a more comprehensive understanding of the data.
Key characteristics of multiple variable analysis
Focus: Examining the impact of individual independent variables on a single dependent variable.
Analytic approach: Each independent variable is analyzed separately, allowing for isolation of their effects.
Purpose: To identify the individual contributions and significance of multiple variables in explaining the variation in the dependent variable.
Statistical techniques: Common techniques include simple linear regression, multiple linear regression, and ANOVA.
Key characteristics of multivariate analysis
Focus: Examining the relationships and interactions among multiple variables simultaneously.
Analytic approach: Considering all variables together, accounting for their joint effects and potential interdependence.
Purpose: To explore patterns, associations, and structures within the data, identifying underlying factors or dimensions.
Statistical techniques: Common techniques include factor analysis, principal component analysis, cluster analysis, and structural equation modeling.
Examples of multivariate analysis techniques
Principal component analysis (PCA): PCA is used to reduce the dimensionality of data by transforming it into a new set of uncorrelated variables called principal components. R functions for PCA include prcomp() and princomp().
Factor analysis: Factor Analysis aims to identify latent factors that explain the correlations among observed variables. R offers functions like factanal() and psych::fa() for conducting factor analysis.
Canonical correlation analysis (CCA): CCA examines the relationships between two sets of variables and identifies the linear combinations of each set that have maximum correlation with each other. The cancor() function in the stats package can be used for this analysis.
Cluster analysis: Cluster Analysis groups similar observations into clusters based on the similarity of their characteristics. R provides various clustering techniques, such as k-means clustering (kmeans()), hierarchical clustering (hclust()), and model-based clustering (Mclust()).
Discriminant analysis: Discriminant Analysis aims to find a linear combination of variables that maximally separate predefined groups or classes. R offers functions like lda() and qda() for performing Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA), respectively.
Multivariate regression: Multivariate Regression extends simple linear regression to multiple response variables. The lm() function in R can be used for multivariate regression analysis.
Multivariate analysis of variance (MANOVA): MANOVA extends the analysis of variance (ANOVA) to multiple response variables simultaneously. The manova() function in R can be used for MANOVA.
Multidimensional scaling (MDS): MDS visualizes the similarity or dissimilarity between objects in a lower-dimensional space. R provides functions like cmdscale() and isoMDS() for performing MDS.
Structural Equation Modeling (SEM): SEM is a comprehensive framework for testing complex relationships among variables. R packages like lavaan and sem offer functionalities for conducting SEM.
Correspondence Analysis: Correspondence Analysis explores the associations between categorical variables and visualizes them in a low-dimensional space. The ca() function in the ca package is commonly used for correspondence analysis.
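As a small illustration of one of these techniques, here is a sketch of PCA using prcomp() on the mtcars data (used again in Part III); standardizing the variables first is common practice so that measurement units do not dominate the components:

```r
# run PCA on the mtcars variables, standardizing each column
pca <- prcomp(mtcars, scale. = TRUE)

# proportion of variance explained by each principal component
summary(pca)

# loadings of the first two components
pca$rotation[, 1:2]
```

The summary shows how few components are needed to capture most of the variance, and the loadings show which original variables drive each component.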
We will consider a few of these models in the final weeks of the course.
Part III: Code
This week, we use some standard data included in R to further discuss model interpretation.
While these data sets do not directly connect to the content of our course, they provide useful examples to return to, as they are discussed on many websites and online forums devoted to R.
Each example illustrates different scenarios for interpreting linear models using the summary output. Remember to consider coefficients, standard errors, t-values, and p-values to assess the significance and direction of relationships between predictors and the response variable. Additionally, theory construction and relevant knowledge and context are crucial for a comprehensive interpretation of the results.
These data come from the 1974 Motor Trend US magazine. The data set comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). You could run similar models using data in the critstats package.
mpg cyl disp hp
Min. :10.40 Min. :4.000 Min. : 71.1 Min. : 52.0
1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8 1st Qu.: 96.5
Median :19.20 Median :6.000 Median :196.3 Median :123.0
Mean :20.09 Mean :6.188 Mean :230.7 Mean :146.7
3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0 3rd Qu.:180.0
Max. :33.90 Max. :8.000 Max. :472.0 Max. :335.0
drat wt qsec vs
Min. :2.760 Min. :1.513 Min. :14.50 Min. :0.0000
1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89 1st Qu.:0.0000
Median :3.695 Median :3.325 Median :17.71 Median :0.0000
Mean :3.597 Mean :3.217 Mean :17.85 Mean :0.4375
3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90 3rd Qu.:1.0000
Max. :4.930 Max. :5.424 Max. :22.90 Max. :1.0000
am gear carb
Min. :0.0000 Min. :3.000 Min. :1.000
1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:2.000
Median :0.0000 Median :4.000 Median :2.000
Mean :0.4062 Mean :3.688 Mean :2.812
3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:4.000
Max. :1.0000 Max. :5.000 Max. :8.000
Example 1: Simple Linear Regression
# Fit a simple linear regression model
model <- lm(mpg ~ hp, data = mtcars)
# Print the model summary
summary(model)
Call:
lm(formula = mpg ~ hp, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-5.7121 -2.1122 -0.8854 1.5819 8.2360
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.09886 1.63392 18.421 < 2e-16 ***
hp -0.06823 0.01012 -6.742 1.79e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.863 on 30 degrees of freedom
Multiple R-squared: 0.6024, Adjusted R-squared: 0.5892
F-statistic: 45.46 on 1 and 30 DF, p-value: 1.788e-07
The summary output provides information about the coefficients, standard errors, t-values, and p-values. In this case, the intercept represents the estimated baseline miles per gallon (mpg) when horsepower is zero. The coefficient for horsepower indicates the estimated change in mpg for each unit increase in horsepower.
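We can make this interpretation concrete with predict(); for instance, the expected mpg for a hypothetical car at a chosen horsepower value (150 hp here, picked only for illustration):

```r
# predicted mpg for a car with 150 horsepower,
# along with a 95% confidence interval for the mean response
predict(model, newdata = data.frame(hp = 150), interval = "confidence")
```

Reporting a prediction with its interval is often more readable for an audience than the raw slope and intercept.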
Example 2: Multiple Linear Regression
# Fit a multiple linear regression model
model <- lm(mpg ~ hp + wt, data = mtcars)
# Print the model summary
summary(model)
Call:
lm(formula = mpg ~ hp + wt, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
hp -0.03177 0.00903 -3.519 0.00145 **
wt -3.87783 0.63273 -6.129 1.12e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
The summary output provides interpretation for each coefficient. For example, the coefficient for horsepower represents the estimated change in mpg for each unit increase in horsepower, holding weight constant. Similarly, the coefficient for weight represents the estimated change in mpg for each unit increase in weight, holding horsepower constant.
Example 3: Categorical Predictor
# Fit a linear regression model with a categorical predictor
model <- lm(mpg ~ factor(cyl), data = mtcars)
# Print the model summary
summary(model)
Call:
lm(formula = mpg ~ factor(cyl), data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-5.2636 -1.8357 0.0286 1.3893 7.2364
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 26.6636 0.9718 27.437 < 2e-16 ***
factor(cyl)6 -6.9208 1.5583 -4.441 0.000119 ***
factor(cyl)8 -11.5636 1.2986 -8.905 8.57e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.223 on 29 degrees of freedom
Multiple R-squared: 0.7325, Adjusted R-squared: 0.714
F-statistic: 39.7 on 2 and 29 DF, p-value: 4.979e-09
When a categorical predictor, such as “cyl” (number of cylinders), is included in the model, R automatically treats it as a set of dummy variables. The summary output provides the coefficients for each category level (e.g., 4 cylinders, 6 cylinders, 8 cylinders). These coefficients represent the estimated difference in the response variable (mpg) compared to the reference category (usually the intercept).
Example 4: Interaction Effect
# Fit a linear regression model with an interaction term
model <- lm(mpg ~ hp * wt, data = mtcars)
# Print the model summary
summary(model)
Call:
lm(formula = mpg ~ hp * wt, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.0632 -1.6491 -0.7362 1.4211 4.5513
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 49.80842 3.60516 13.816 5.01e-14 ***
hp -0.12010 0.02470 -4.863 4.04e-05 ***
wt -8.21662 1.26971 -6.471 5.20e-07 ***
hp:wt 0.02785 0.00742 3.753 0.000811 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.153 on 28 degrees of freedom
Multiple R-squared: 0.8848, Adjusted R-squared: 0.8724
F-statistic: 71.66 on 3 and 28 DF, p-value: 2.981e-13
When an interaction term (e.g., horsepower * weight) is included in the model, the summary output provides coefficients for both main effects (horsepower and weight) as well as the interaction term. The interaction coefficient represents the change in the relationship between mpg and horsepower as weight increases.
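Because the hp slope now depends on wt, a useful interpretation step is to evaluate the marginal effect of hp at a few weights: it equals the hp coefficient plus the interaction coefficient times wt. The weight values below are chosen only for illustration:

```r
# marginal effect of horsepower on mpg at several car weights
# (wt is measured in 1000 lbs)
b <- coef(model)
for (w in c(2, 3, 4)) {
  cat("wt =", w, ": hp slope =", b["hp"] + b["hp:wt"] * w, "\n")
}
```

Seeing the slope shrink toward zero as weight increases conveys the interaction far more clearly than the raw coefficient table.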
# do some exploratory analysis on the survey data in the MASS package
library(dplyr)
survey
Sex Wr.Hnd NW.Hnd W.Hnd Fold Pulse Clap Exer Smoke Height M.I
1 Female 18.5 18.0 Right R on L 92 Left Some Never 173.00 Metric
2 Male 19.5 20.5 Left R on L 104 Left None Regul 177.80 Imperial
3 Male 18.0 13.3 Right L on R 87 Neither None Occas NA <NA>
4 Male 18.8 18.9 Right R on L NA Neither None Never 160.00 Metric
5 Male 20.0 20.0 Right Neither 35 Right Some Never 165.00 Metric
6 Female 18.0 17.7 Right L on R 64 Right Some Never 172.72 Imperial
7 Male 17.7 17.7 Right L on R 83 Right Freq Never 182.88 Imperial
8 Female 17.0 17.3 Right R on L 74 Right Freq Never 157.00 Metric
9 Male 20.0 19.5 Right R on L 72 Right Some Never 175.00 Metric
10 Male 18.5 18.5 Right R on L 90 Right Some Never 167.00 Metric
11 Female 17.0 17.2 Right L on R 80 Right Freq Never 156.20 Imperial
12 Male 21.0 21.0 Right R on L 68 Left Freq Never NA <NA>
13 Female 16.0 16.0 Right L on R NA Right Some Never 155.00 Metric
14 Female 19.5 20.2 Right L on R 66 Neither Some Never 155.00 Metric
15 Male 16.0 15.5 Right R on L 60 Right Some Never NA <NA>
16 Female 17.5 17.0 Right R on L NA Right Freq Never 156.00 Metric
17 Female 18.0 18.0 Right L on R 89 Neither Freq Never 157.00 Metric
18 Male 19.4 19.2 Left R on L 74 Right Some Never 182.88 Imperial
19 Male 20.5 20.5 Right L on R NA Left Some Never 190.50 Imperial
20 Male 21.0 20.9 Right R on L 78 Right Freq Never 177.00 Metric
21 Male 21.5 22.0 Right R on L 72 Left Freq Never 190.50 Imperial
22 Male 20.1 20.7 Right L on R 72 Right Freq Never 180.34 Imperial
23 Male 18.5 18.0 Right L on R 64 Right Freq Never 180.34 Imperial
24 Male 21.5 21.2 Right R on L 62 Right Some Never 184.00 Metric
25 Female 17.0 17.5 Right R on L 64 Left Some Never NA <NA>
26 Male 18.5 18.5 Right Neither 90 Neither Some Never NA <NA>
27 Male 21.0 20.7 Right R on L 90 Right Some Never 172.72 Imperial
28 Male 20.8 21.4 Right R on L 62 Neither Freq Never 175.26 Imperial
29 Male 17.8 17.8 Right L on R 76 Neither Freq Never NA <NA>
30 Male 19.5 19.5 Right L on R 79 Right Some Never 167.00 Metric
31 Female 18.5 18.0 Right R on L 76 Right None Occas NA <NA>
32 Male 18.8 18.2 Right L on R 78 Right Freq Never 180.00 Metric
33 Female 17.1 17.5 Right R on L 72 Right Freq Heavy 166.40 Imperial
34 Male 20.1 20.0 Right R on L 70 Right Some Never 180.00 Metric
35 Male 18.0 19.0 Right L on R 54 Neither Some Regul NA <NA>
36 Male 22.2 21.0 Right L on R 66 Right Freq Occas 190.00 Metric
37 Female 16.0 16.5 Right L on R NA Right Some Never 168.00 Metric
38 Male 19.4 18.5 Right R on L 72 Neither Freq Never 182.50 Metric
39 Male 22.0 22.0 Right R on L 80 Right Some Never 185.00 Metric
40 Male 19.0 19.0 Right R on L NA Neither Freq Occas 171.00 Metric
41 Female 17.5 16.0 Right L on R NA Right Some Never 169.00 Metric
42 Female 17.8 18.0 Right R on L 72 Right Some Never 154.94 Imperial
43 Male NA NA Right R on L 60 <NA> Some Never 172.00 Metric
44 Female 20.1 20.2 Right L on R 80 Right Some Never 176.50 Imperial
45 Female 13.0 13.0 <NA> L on R 70 Left Freq Never 180.34 Imperial
46 Male 17.0 17.5 Right R on L NA Neither Freq Never 180.34 Imperial
47 Male 23.2 22.7 Right L on R 84 Left Freq Regul 180.00 Metric
48 Male 22.5 23.0 Right R on L 96 Right None Never 170.00 Metric
49 Female 18.0 17.6 Right R on L 60 Right Some Occas 168.00 Metric
50 Female 18.0 17.9 Right R on L 50 Left None Never 165.00 Metric
51 Male 22.0 21.5 Left R on L 55 Left Freq Never 200.00 Metric
52 Male 20.5 20.0 Right L on R 68 Right Freq Never 190.00 Metric
53 Male 17.0 18.0 Right L on R 78 Left Some Never 170.18 Imperial
54 Male 20.5 19.5 Right L on R 56 Right Freq Never 179.00 Metric
... (printed output truncated: rows 55-237 of the data frame, followed by the wrapped `Age` column for all 237 rows, omitted for brevity)
# convert to a tibble for cleaner printing (requires the tibble package)
survey <- as_tibble(survey)

# check the structure of the data
str(survey)